Hello, everyone. So this talk won't be quite like other talks you might have seen around.
It's going to be very demo heavy. And it's on 3D web visualization. So congratulations
those of you who made it. This is one of DEF CON's unofficial scavenger hunts, finding
the talk that isn't listed. So if you're in the audience, you're a winner.
And if you're not in the audience and you're watching it on video, well, better luck next
time. All right. So I'm Alejandro, call me Alex.
I'm the owner of Hyperion Gray. I'm the backend developer for the tool that you're about to
see in 3D. I'm interested in applying distributed computing as it relates to breaking things
and finding vulnerabilities in things. So you're going to see a lot of that throughout
the talk. It's kind of a theme of mine, if you will. Originally a web app pen tester
by trade, started focusing on software development after that. So hope you like the tool.
And my name is Teal Rogers. I am a maker and an interface developer specializing in 3D.
We're actually a little bit ahead of time. So this is a ‑‑ it's a 3D visualization
of the web, which is so ‑‑ they're not on the screen. Well, let's see about that.
There we go. No, it's not really a big deal.
We actually just have one slide and then we just get into the demo.
Yeah. Yeah. If you want to read about the problem, we're solving ‑‑
Yeah. It should be okay. All right. So once again, congratulations,
those of you who made it. And we're just going to jump right into it, show you the
program. And take you around.
So this environment is ‑‑ you know what I can do? If it cuts off the edge a little
bit, I can move this over. Does that look good?
A little bit more.
All right. So this is basically our application. It is a ‑‑ it's a 3D environment and what
you see here is domains. Domains are represented as globes with, you know, graphics
and stuff. And when a domain spawns, there are smaller balls which represent pages. So
this is not attempting to reproduce a web browser or anything.
It is giving metadata for the Internet. Sort of a view that no one else is doing. There's
things that you can see from this ‑‑ this organization that you can't see anywhere
else except by, like, really digging into your HTML, digging into what your code is,
doing vulnerability scans, using scanning software, which frankly most people don't do.
So we wanted to make it easy to do that.
And this is what this is designed to do.
Yeah. And right now what you're seeing, all this is running in a test environment
of ours. We have about 15 to 20 websites. Some of those websites have various vulnerabilities
in them. Some of them have misconfigurations. Some of them just have really messed up configurations,
just little things that make it an odd site. So along with that, we do have a few production
websites. So we'll point those out when we go to them. If you're checking them out,
like, on your smart phone, you can see those. Things like HyperionGray.com, TrinarySoftware.com.
Some of them are Internet facing, and some of them are not.
So the first site we wanted to show you was DC Graphics. And I'll just show you real quick
in the web browser, it's a very small site. It's basically a joke site, really. It's two
posts on WordPress. And some of these sites are very, very small.
I'll just give it a second here to reconnect. See, the connection is right there. So it's
a very small site. In our environment we do a crawl in advance and it maps all the links.
So you can see a line from one page to another page is a link. And they're all directional
because links are directional. So the skinny end is where it's pointing to, and the fat end is where it's coming from.
It's kind of like an arrow. So this is a typical sort of WordPress site if it's very small.
Larger WordPress sites tend to have a core and then an outer core. This one just has
an inner core because it's too small to have two cores. So you see the core is these sites
here. And on the outside you see some feed sites, which is what WordPress does. It
creates feed sites.
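As a rough illustration of the link map described above, here is a minimal sketch of turning already-crawled pages into directed edges. The page contents, the example.test URLs, and the names LinkCollector and build_link_graph are made up for this example; the talk does not show how the real crawler stores its links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_link_graph(pages):
    """pages maps URL -> HTML. Returns a set of directed (source, target) edges."""
    edges = set()
    for url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links against the page they appear on.
            edges.add((url, urljoin(url, href)))
    return edges


# Hypothetical two-page site, standing in for a small WordPress blog.
pages = {
    "http://example.test/": '<a href="/post-1">First post</a>',
    "http://example.test/post-1": '<a href="/">Home</a> <a href="/feed">Feed</a>',
}
edges = build_link_graph(pages)
```

Each edge keeps its direction, so the home page linking to a post and the post linking back home are two separate edges, which matches the arrow-like lines in the viewer.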
Yeah, so on the back end the crawler is running continuously to keep everything
updated. So the goal is to give you a continuously updated view of the websites that you're seeing.
Right now you're actually just seeing a snapshot in time. But the way that it's built on the
back end is it's a Hadoop‑based web crawler slash vulnerability scanner. So we can keep
track of a ton of websites over a short period of time. And the more websites we keep track
of, all we basically have to do is scale up the cluster, which is really, really easy.
And that will shorten our crawling times and collection of metadata and all that stuff.
So you might recognize that as somewhat similar to like the Google model, how they're collecting
metadata and collecting websites. Actually the back end is ‑‑ it's built on Hadoop, an open‑source
implementation of the MapReduce model that Google published.
So here's another site. It's another joke site, really. It will spawn in a second here.
It's also written in WordPress. It's just a little bit bigger. It's got a few more posts.
And in a little bit here, you'll notice something strange, which is why we're focusing on this
one at the moment.
It has this link that will appear to a domain called 1.gravatar.com.
And I made this site.
And I don't know what 1.gravatar.com is.
I haven't dug through the source code.
The only way I know that it's there is through this app.
See, there it is.
See, we don't actually crawl 1.gravatar.com.
That's why it looks like a distortion.
But one of the advantages of seeing the metadata of the Internet, the Internet's underbelly,
is you get to see that there's these weird links all over the place that you didn't even
know were there.
Even if you made the site, you didn't know they were there.
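One way to surface links like that without a 3D view is to list every host a page links to that is not the page's own host. This is only a sketch: the function name external_hosts and the blog.example.test URLs are hypothetical, and only 1.gravatar.com comes from the talk.

```python
from urllib.parse import urlsplit


def external_hosts(page_url, link_urls):
    """Return the sorted hosts that a page links to, excluding its own host."""
    own_host = urlsplit(page_url).hostname
    hosts = set()
    for link in link_urls:
        host = urlsplit(link).hostname
        # Relative links have no host and always stay on the same site.
        if host and host != own_host:
            hosts.add(host)
    return sorted(hosts)


found = external_hosts(
    "http://blog.example.test/",
    [
        "http://blog.example.test/about",
        "http://1.gravatar.com/avatar/abc",
        "/relative/path",
    ],
)
```

Here found holds only 1.gravatar.com, the kind of third-party domain that a theme or plugin can add without the site owner noticing.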
Yeah.
By the way, does anybody know what 1.gravatar.com is, by chance?
Anybody want to go to it?
This guy knows.
It's the picture.
So your e-mail is used for your user name.
Do you recognize that?
It's a picture site.
All right.
So we needed a 3D map and a dude in the audience to tell us what that was.
So that's cool.
Yeah.
Yeah.
Awesome.
So next up, Teal's going to show us punkspider.hyperiongray.com.
This is an example of a live production site that's out there.
Has anybody heard of the PunkSPIDER project, by chance?
No?
Nobody?
That guy?
No.
He was just fidgeting a little.
Yeah.
My girlfriend.
Yeah.
So this is PunkSPIDER.
It's a distributed vulnerability search engine, which was kind of a precursor to Web 3.0.
It's not in 3D.
It's not quite this fancy, but it does use on the back end a distributed vulnerability
scanner that I wrote that gives you back vulnerabilities on websites,
much like Web 3.0 does, which we'll get into a little bit later.
But on the back end, Web 3.0 is a little bit fancier than PunkSPIDER is.
It kind of froze up here a little bit.
But this is a production platform.
Not that one.
And we're still tracking down all the stuff.
It's really easy to restart it.
Yeah.
It's a prototype.
Sorry to interrupt you.
No.
You're fine.
Thank you.
So I was saying about PunkSPIDER.
No.
I was saying about Web 3.0.
So on the back end, we're using a distributed HBase back end.
So you already might notice the theme here, right?
Everything that I write on the back end is completely distributed.
So what that means, if you're not familiar with HBase, is it's a huge key value store
that runs on a Hadoop cluster.
So the more keys and values you have, the more you just scale up your
cluster by adding a machine, which again is really easy, takes
about a minute, and lets it scale out pretty much as far as you need.
So the more data we have, we just have to scale up our cluster, and since most of this
stuff is in the cloud, that really just takes like 30 seconds to a minute.
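To make the key-value idea concrete, here is a toy in-memory stand-in, not HBase itself. HBase keeps rows sorted by row key, so one common layout (assumed here, the talk does not describe the actual schema) is to reverse the host name so that all pages of a domain sit next to each other. The names row_key, put, and scan_domain are invented for this sketch.

```python
from urllib.parse import urlsplit


def row_key(url):
    """Build a key like 'test.example.www/path' so one domain's pages sort together."""
    parts = urlsplit(url)
    reversed_host = ".".join(reversed(parts.hostname.split(".")))
    return reversed_host + (parts.path or "/")


def put(store, url, metadata):
    """Store page metadata under its row key (a plain dict stands in for the table)."""
    store[row_key(url)] = metadata


def scan_domain(store, host):
    """Return the keys for one host in sorted order, like a prefix scan would."""
    prefix = ".".join(reversed(host.split("."))) + "/"
    return [key for key in sorted(store) if key.startswith(prefix)]


store = {}
put(store, "http://www.example.test/about", {"links": 3})
put(store, "http://other.test/", {"links": 1})
put(store, "http://www.example.test/", {"links": 7})
keys = scan_domain(store, "www.example.test")
```

In a real cluster the sorted key space is split into regions spread across machines, which is why adding machines lets the same layout hold more data.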
So if you notice, sometimes domains disappear.
Sometimes they reappear.
This is controlled by interest.
Whatever domain you have focused in the center of the view is the domain you're interested
in.
And it's a value that the interface keeps track of just to keep the screen less cluttered.
And so as domains lose interest, as you're not focused on them, they disappear from the
interface and then they reappear.
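The interest value could work roughly like the sketch below: every frame each domain's interest decays, the focused domain gets a boost, and domains under a threshold are hidden. The decay rate, gain, threshold, and function names are all assumptions for illustration; the talk only says that interest is a value the interface tracks.

```python
def update_interest(interest, focused, gain=1.0, decay=0.8):
    """Decay every domain's interest, then boost the domain in the center of the view."""
    updated = {}
    for domain, value in interest.items():
        value = value * decay
        if domain == focused:
            value += gain
        updated[domain] = value
    return updated


def visible(interest, threshold=0.2):
    """Domains whose interest is still high enough to be drawn."""
    return sorted(d for d, v in interest.items() if v >= threshold)


# Hypothetical domains: keep the view centered on one of them for ten frames.
interest = {"a.test": 1.0, "b.test": 1.0}
for _ in range(10):
    interest = update_interest(interest, focused="a.test")
shown_while_focused_on_a = visible(interest)

# Looking at the other domain again brings it back.
interest = update_interest(interest, focused="b.test")
shown_after_refocus = visible(interest)
```

With these numbers the unfocused domain drops out of view after a few frames and reappears as soon as it is focused again.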
So here we have a vulnerable domain.
That's why it's spewing stuff.
It's vulnerable to SQL injection.
This particular domain we made on purpose to be vulnerable.
And that's why it's called SQLI1.
And you ‑‑ there's another thing you can't really see from just looking at HTML,
looking at the page.
You can't see whether your site is vulnerable to hacks.
And with our Web 3.0 visualizer, you can see an overview of whether you're linked
to sites that happen to be vulnerable, or whether sites you just find, you know, randomly
on the Internet are vulnerable. You want
to know whether they're vulnerable or not for various reasons. And so here we have HyperionGray.com
which is Alex's site. It is ‑‑
Yeah, actually one quick note about the
vulnerability scanner. So every site that goes through the system gets scanned for vulnerabilities.
The base of the vulnerability scanner right now is done. It's still pretty basic. It's
essentially just a little fuzzer that goes through GET parameters. But of course we're
expanding that and making it a much, you know, fancier vulnerability scanner. So the way
it works is a little bit unique. So a web crawler, when you actually go out and crawl
sites with a web crawler, you're collecting a ton of metadata on those sites. What the
scanner does is it makes vulnerabilities a completely integral part of that metadata.
So you essentially don't crawl unless you're looking for vulnerabilities in a site, which
is pretty cool. And, again, going along with my theme, this is a fully distributed vulnerability
scanner. So the more nodes we have in our cluster on the back end, the faster we can
scan. So that makes it really useful. Essentially we can scale this up and keep track of not
only a map of the entire Internet, but we can scan the entire Internet for vulnerabilities,
which is pretty cool.
So as you can tell, there's a lot of stuff
here. Alex is all about the scanner and the vulnerability. I'm focused on the 3D engine.
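For a sense of what "a little fuzzer that goes through GET parameters" can mean, here is a minimal sketch that only generates candidate URLs and sends nothing. It appends a probe string to one query parameter at a time. The single-quote probe, the sqli1.test host, and the function name fuzz_get_parameters are assumptions; the actual scanner's payloads and detection logic are not described in the talk.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def fuzz_get_parameters(url, probe="'"):
    """Return one URL variant per GET parameter, with the probe appended to that value."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    variants = []
    for index, (name, value) in enumerate(params):
        mutated = list(params)
        mutated[index] = (name, value + probe)
        # urlencode percent-encodes the probe, so a quote becomes %27.
        variants.append(urlunsplit(parts._replace(query=urlencode(mutated))))
    return variants


variants = fuzz_get_parameters("http://sqli1.test/item?id=1&cat=2")
```

A scanner would request each variant and compare the response with the original page, for example looking for database error text. That comparison step is left out here.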
So here we have Alex's site, which is built in Drupal. And this is one of the peculiarities
you can find with Drupal sites sometimes, is that you'll find, like, this area here,
which is kind of weird URLs that don't really say much, node 26, node 29. And you can see
that. And then you'll find longer URLs, human readable URLs over here. So Drupal is creating
these weird little URLs and just forwarding them to the longer ones. And this kind of
creates a kite, a sort of a main page over here, and then a kite in the background, which
is really funny, actually. But this isn't really the view that you would see as a user
of the site.
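The two clusters could be separated from a crawl's URL list with a check like the one below. Drupal's internal paths have the form /node/<number>; everything else is treated here as a human-readable alias. The example URLs and the name split_drupal_urls are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Drupal's internal content path, for example /node/26.
NODE_PATH = re.compile(r"^/node/\d+/?$")


def split_drupal_urls(urls):
    """Split URLs into internal node paths and everything else (aliases and other pages)."""
    node_urls, alias_urls = [], []
    for url in urls:
        if NODE_PATH.match(urlsplit(url).path):
            node_urls.append(url)
        else:
            alias_urls.append(url)
    return node_urls, alias_urls


node_urls, alias_urls = split_drupal_urls([
    "http://drupal.example.test/node/26",
    "http://drupal.example.test/node/29",
    "http://drupal.example.test/blog/some-readable-title",
])
```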
You would just see this, the stuff on the right. But as a crawler,
as Google, or even as somebody collecting information on a site
for whatever purpose, you would
want to know that there's this kind of little odd structure going on here. Because, you
know, information is power.
Yeah. So that's the ‑‑
Now Teal is going to take us to the site bushofficial.com. So Bushofficial is an example of a live production
site that we actually do not own. We've just kind of sample crawled the site. So this actually
isn't the entire site. It is the official site of the band Bush. Do we have any Bush
fans out there?
Yeah.
Okay.
Yeah. I've never actually heard Bush.
Yeah.
But I think they're probably a great band.
Yeah. And we're sure they're lovely people.
And, yeah, lovely people.
Yeah.
So ‑‑
So the system is able to, as I mentioned, in a very non‑invasive way ‑‑
Yeah.
Check for vulnerabilities in these sites. We're doing some really respectful stuff from
a network standpoint. We're respecting robots.txt even during vulnerability scanning throughout
the entire thing. And we never flood the site with traffic in any way. So you'll see
just a few vulnerable links pop up. These are URLs in the domain with SQL injection.
They are real, but don't misuse this. Don't be a dick, I guess is all I'm trying to say.
So, yeah.
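Respecting robots.txt before fetching or scanning a URL can be sketched with Python's standard urllib.robotparser. The robots.txt text, the band.test URLs, the user agent string, and the helper name allowed_urls are made up for the example; the crawler in the talk is Java-based and its own implementation is not shown.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, user_agent, urls):
    """Keep only the URLs that the given robots.txt lets this user agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


robots_txt = "User-agent: *\nDisallow: /private/\n"
candidates = [
    "http://band.test/tour",
    "http://band.test/private/admin?id=1",
]
to_scan = allowed_urls(robots_txt, "example-crawler", candidates)
```

Applying the same filter to the scanner's test requests, not just the crawl, is what keeps disallowed paths out of vulnerability scanning as well.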
Yeah.
Yeah. So you can see Bush is kind of a typical site. They love MySpace and they love Facebook
back here even though the labor ‑‑ and here's Twitter. They love Twitter. And so
they're showing a lot of love for social networks, which, well, that's pretty typical of a band.
Twitter is, of course, a massive site and we don't crawl it even though it kind of looks
like we do. We just crawl if they happen to link to it. And over here we find someone
else who happens to link to it.
What's the meaning of the distance?
Well, the distance is pretty random. You can actually take domains and just drag them around
with the mouse. So you can grab Twitter and drag it over here. It's all organized dynamically
based on what you want to do with it. It's based on whether it's connected to anything that happens to
be shown on the screen at that time. So here we have the DEF CON website, which
is really quite interestingly organized. It is ‑‑ there's a core of pages here,
which is the main site. And then here's a satellite page. So this isn't a full crawl
of the DEF CON website or you would see a bunch of different satellite pages also. But
DEF CON is one of the few sites that actually does this on purpose. Many sites do this by
accident. So you can see DEF CON shows a lot of love to Facebook. It shows a lot of
love to Twitter. And it shows a little bit of love to Amazon, just in this one little
URL, links slash book list dot HTML. So you can see where their priorities are.
Yeah, so we still have a lot of information. Yeah.
Plenty of time, actually. We probably have some time for questions in the end. I think
Teal is just going to show you www.trinarysoftware.com real quick and tell you a little bit about
how you can get involved with the project. We definitely need your help. So, Teal, you
want to tell them a little bit about that?
All right. So this is my website that I've
made in, like, the last week. And if you look at the structure in our Web 3.0
viewer, you see some really interesting things, actually. You'll see that there's
this structure here, which is pretty normal. But then there's this, like, weird little
structure off to the side. And this is actually the non‑www trinarysoftware.com. So, what I've
done ‑‑ and I haven't corrected it, just so I can show you ‑‑ is I have accidentally, with this
mistake here, linked some of my pages to a different domain, a non‑www domain. And as
far as Google is concerned, this is a big no‑no. And Google is kind of like
the government of the Internet, so to speak. And so this sort of mistake will
often get you "Google fined." And I made up that term. I'm hoping it sticks, but that isn't
an actual term. So it's a good example of why you would want to get a map of your website.
Because if you just crawl manually through your own site in the web browser and
examine your HTML, you're very likely to
miss the fact that you stuck some of the pages in non‑www format. Now, fortunately,
my site doesn't have any SQL injection vulnerabilities on it. Or you'd be able to see that as well.
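A mix-up between the www and non-www host can also be caught from a list of crawled links. The sketch below flags links whose target is the same site under the other host name. The function names and the example URLs other than trinarysoftware.com are hypothetical, and the edge list is assumed to come from a crawl like the one shown in the talk.

```python
from urllib.parse import urlsplit


def strip_www(host):
    """Drop a leading 'www.' so both spellings of a host compare equal."""
    return host[4:] if host.startswith("www.") else host


def mixed_host_links(canonical_host, edges):
    """Return (source, target) links that point at the same site under a different host name."""
    flagged = []
    for source, target in edges:
        host = urlsplit(target).hostname
        if host and host != canonical_host and strip_www(host) == strip_www(canonical_host):
            flagged.append((source, target))
    return flagged


edges = [
    ("http://www.trinarysoftware.com/", "http://www.trinarysoftware.com/about"),
    ("http://www.trinarysoftware.com/", "http://trinarysoftware.com/contact"),
    ("http://www.trinarysoftware.com/", "http://twitter.com/someone"),
]
flagged = mixed_host_links("www.trinarysoftware.com", edges)
```

Only the link to the bare trinarysoftware.com host is flagged; the link to an unrelated domain is left alone.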
So what we have here, what this software is, is it's a prototype. It's ‑‑ it's under
active development, and there's a lot of directions that we could take this. We want ‑‑ we
want our friends to get involved. And by our friends, I mean all of you guys. We want
‑‑ so what we've done, what I've done is we've made this ‑‑ we've made a mailing
list for now. So if you ‑‑ on the site, on trinarysoftware.com, you can sign up for
a mailing list, and we're going to be offering everyone who is on the mailing list in a month,
or probably two months, free access to the closed beta. So we're ‑‑ we really want
you to be involved, and for contextual reasons, we can't really offer an open beta, but we
want to offer everyone here free access to the closed beta.
Yeah. And for the back‑end engine, actually, I'm releasing this free and open
source under the Apache license, so you can do whatever you want with it. I know Teal's
also going to offer a free version of this when it does actually come out. Yeah, and
thanks for coming. One last note, if you're interested in offensive techniques in distributed
computing, I have another talk here that's coming up at 3 p.m. at track one. So definitely
catch that if that sounds like something that's interesting to you. And thanks for coming.
Thanks a lot, guys.
We actually have about five minutes where we can take some questions.
Where do you go to opt‑in for the beta? It's www.trinarysoftware.com.
T‑R‑I ‑‑ Well, here it is, right here, actually.
Right on the top there.
So ‑‑
It's actually very simple. It's T‑R‑I‑N‑A‑R‑Y, trinary, like three nary's.
All right. So trinary software is a technically meaningless term in almost any kind of functional
sense, unless you think of it in terms of 3‑D. So that's really
the only way that you could take what trinary software is.
It's 3‑D software.
Other questions?
What?
So actually it's a customized version of a crawler called Apache Nutch, which is where
the Apache Hadoop project actually came from. It spawned from Apache Nutch. So we've customized
it, added a bunch of plug‑ins on the back end and, again, we're releasing all that stuff as open
source afterward.
Yeah.
When will you be showing off the crawler?
I don't know.
I don't know.
I will be showing off a little bit more about the crawler and vulnerability scanner in my
talk at 3 p.m. on track one. Thanks, John.
Anybody else?
Yeah.
Have you thought about rendering it 3‑D and, like, using a Leap Motion to control
it?
Using what kind of motion?
Leap Motion is a thing that can actually track hands.
Yeah.
It could ‑‑ we can use, like, biomechanics.
We can use biofeedback and Leap Motion. And that is actually really easy to plug in.
And that's a direction we can take this. We're trying to figure out what directions.
We want all the good suggestions like that that we can get. And any kind of, you know,
contributions, any kind of input that anybody can give, feel free to e‑mail us. Our
e‑mails are listed on our website.
Or, you know, if you have any ideas or you have a project that this would really interrelate
with, it would be very helpful.
Yeah.
We definitely are hoping to get a little back and forth with the community.
I mean, we really want to make the community kind of an integral part of where we take
this entire thing.
So definitely if you have ideas or you just want to talk with us or have any additional
questions, yeah, shoot us an e‑mail.
Thanks for having us. Thank you.
Follow me on Twitter, @DotSlashPunk, or come to my talk, again, at 3 p.m. on track
one.
All right.
Thank you for coming.
Thanks, everybody.
